ABSTRACT

This  thesis  compares  three  types  of  models  developed  to  predict  overhead  costs 
for  seven  government  aerospace  contractors.  The  methodologies  utilized  to  develop 
the  models  include  generalized  least  squares,  univariate  Box-Jenkins,  and  multivariate 
Box-Jenkins  procedures.  The  results  of  those  models  are  compared  using  three 
measures  of  effectiveness:  correlation  coefficient  between  actual  and  predicted  values, 
root  mean  squared  error  divided  by  the  mean  of  the  actuals,  and  mean  absolute 
percentage  error  (in  percent).  As  was  expected,  the  univariate  Box-Jenkins  method 
produced short term forecasts which were superior to those of the least squares
regression  models.  However,  the  regression  forecasts  were  highly  accurate  and  were 
considerably  less  expensive  to  obtain.  Only  one  multivariate  Box-Jenkins  model  could 
be  developed.  The  results  of  this  model  were  marginally  superior  to  the  related 
regression  model  and  significantly  inferior  to  the  univariate  Box-Jenkins  model  for  the 
same contractor.

I.  INTRODUCTION

Overhead  costs  often  constitute  a  large  portion  of  total  product  costs  in  many 
industries.  In  the  case  of  government  aerospace  contractors,  overhead  costs  may 
comprise  as  much  as  fifty  percent  of  the  contract  cost  negotiated  by  the  Federal 
government.  Historically,  overhead  costs  have  been  predicted  by  applying  estimated 
overhead  rates  to  estimated  labor  hours  in  several  categories  of  operations.  These 
category  totals  are  summed  to  arrive  at  a  total  overhead  cost  for  a  particular  product. 
This  procedure  of  estimating  overhead  is  greatly  affected  by  changes  in  the  level  of 
operations.  Fluctuating  production  rates  often  lead  to  overhead  rates  which  display  a 
lagged  relationship  between  the  applied  rates  and  actual  overhead  costs.  Consequently, 
this  particular  method  of  predicting  overhead  costs  produces  estimates  which  are 
inadequate. 

Alternative  methods  for  predicting  overhead  have  been  proposed.  In  many  cases, 
these  methods  attempt  to  establish  a  direct  relationship  between  costs  and  factors 
related  to  direct  production.  Thus  the  need  for  reliance  on  estimated  overhead  rates  is 
eliminated.  Various  least  squares  regression  models  have  been  proposed  in  attempts  to 
establish  the  desired  direct  relationship.  One  such  model  was  developed  with  simplicity 
in application as a strong consideration [Ref. 1: p. 7]. This allowed users of varying
degrees  of  statistical  familiarity  to  apply  the  model  in  actual  work  conditions.  This 
model  intentionally  utilized  the  minimum  number  of  explanatory  variables  necessary  to 
achieve  accurate  results.  Several  applicable  independent  variables  were  considered  and 
observations  of  direct  personnel  demonstrated  the  strongest  relationship  with  overhead 
costs  [Ref.  1:  p.  20].  As  with  any  economic  trend  data,  autocorrelation  must  be 
suspected.  The  use  of  quarterly  observations  in  the  model  required  testing  for  first 
order  AR(1)  and  fourth  order  AR(4)  autoregressive  processes.  Appropriate  tests  and 
corrections  were  incorporated  into  the  model  to  ensure  the  absence  of  bias  in  the 
standard  errors  of  the  coefficients  and  in  the  R-squared  statistic  [Ref.  2:  p. 283].  The 
results  of  this  model  are  included  for  comparison  purposes. 

A  second  alternative  to  the  use  of  estimated  overhead  rates  is  the  Box-Jenkins 
method  of  forecasting.  This  method,  which  is  designed  to  produce  highly  accurate 
short term forecasts, allows for a wide range of possible models to apply to a particular
economic series [Ref. 3: p. 11]. The degree of statistical sophistication required to apply
this  method  is  far  greater  than  that  required  for  the  regression  model  mentioned  above. 
In  addition,  the  use  of  the  Box-Jenkins  forecasting  method  requires  far  greater 
computer  resources  during  the  three  stages  of  identification,  estimation,  and  forecasting 
than  most  regression  models.  Consequently,  the  Box-Jenkins  method  has  not  been 
applied  as  a  forecasting  tool  in  many  cases  in  which  it  would  be  a  logical  alternative. 

The  Box-Jenkins  transfer  function  (multivariate  Box-Jenkins  method)  utilizes 
deviations  from  appropriate  means  of  an  input  (X)  and  of  an  output  (Y)  to  establish  a 
relationship  from  which  forecasts  can  be  made.  The  connecting  tool  between  these  two 
series is a linear difference equation. This extremely complicated forecasting tool
requires  extensive  expertise  on  the  part  of  the  statistician.  Once  again,  this  highly 
effective  forecasting  tool  has  been  underutilized. 

This  paper  attempts  to  develop  usable  least  squares  regression,  Box-Jenkins,  and 
Box-Jenkins  transfer  function  forecasting  models  for  seven  government  aerospace 
contractors.  The  effectiveness  of  each  model  is  measured  by  withholding  the  last  four 
data  observations  during  the  model  development  phase  and  using  the  model  to  forecast 
the withheld observations. The deviations of the actuals from the predicted values are
summarized with three measures of effectiveness: correlation coefficient between
actual and predicted values, root mean squared error divided by the mean of the
actuals, and mean absolute percentage error (in percent). The development and results
of  each  model  are  included  for  comparison  purposes. 
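
The three measures of effectiveness can be illustrated with a short sketch. The Python below is not part of the original study: the function name `forecast_measures` and the four sample quarters are hypothetical, and the formulas assume the conventional definitions of each measure.

```python
import math

def forecast_measures(actual, predicted):
    """Return (correlation, RMSE / mean of actuals, MAPE in percent)."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    # Pearson correlation coefficient between actual and predicted values
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    corr = cov / math.sqrt(var_a * var_p)
    # Root mean squared error divided by the mean of the actuals
    rmse = math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)
    rel_rmse = rmse / mean_a
    # Mean absolute percentage error, expressed in percent
    mape = 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n
    return corr, rel_rmse, mape

# Hypothetical withheld quarters (illustrative numbers, not thesis data)
actual = [100.0, 110.0, 120.0, 130.0]
predicted = [98.0, 112.0, 119.0, 133.0]
corr, rel_rmse, mape = forecast_measures(actual, predicted)
```

Dividing the root mean squared error by the mean of the actuals makes the measure unit free, which is convenient when contractors differ greatly in size.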


II.  DATA 

The  data  have  been  supplied  by  seven  government  aerospace  contractors.  To 
maintain  confidentiality,  all  references  to  specific  contractors  will  be  with  the  labels  A 
through  G.  A  specific  reporting  format  was  utilized  by  each  contractor  during  data 
collection. Quarterly data spanning the period beginning in the first quarter of 1979
and continuing through the third quarter of 1986 were requested from each of the seven
contractors.  Usable  data  were  obtained  from  each  contractor  for  large  portions  of  the 
requested  period.  However,  only  one  contractor  was  able  to  supply  all  thirty-one 
observations. 

In  the  reporting  format,  overhead  costs  were  composed  of  costs  from  three  major 
categories  which  had  been  further  refined  into  six  subcategories.  The  first  major 
category,  labor  related  costs,  was  composed  of  two  subcategories:  indirect  salaries  and 
fringe  benefits.  The  second  major  category  was  facility  costs.  The  third  major 
category was composed of three subcategories: electronic data processing costs,
independent  research  and  development  and  bid  and  proposal  costs,  and  all  other 
overhead  costs. 

All components of overhead costs were converted to constant 1984 fourth
quarter dollars. The labor related cost categories were adjusted through the
application of the Bureau of Labor Statistics SIC 372 price index for
production worker average hourly wages for the aircraft and parts industry. In the
case of this index, monthly indices were averaged to produce quarterly indices. Gross
National Product Deflator (GNPD) indices published by the Bureau of Economic Analysis
were applied to the two remaining major categories. Facility costs were adjusted with the
GNPD gross private domestic fixed nonresidential investment index. The GNPD
personal consumption services index was utilized to adjust the final major category. As
with all indices, those used were imperfect. They were chosen because they provide the
best adjustments for inflation among all readily available and relevant indices.
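
The conversion to constant dollars can be sketched as follows. This is an illustration only: the function names and all numeric values are hypothetical (they are not BLS or BEA figures), and the sketch assumes the usual convention of scaling each cost by the ratio of the base-quarter index to the index of the quarter in which the cost was incurred.

```python
def quarterly_index(monthly_index):
    """Average each consecutive group of three monthly index values."""
    if len(monthly_index) % 3 != 0:
        raise ValueError("expected a whole number of quarters")
    return [sum(monthly_index[i:i + 3]) / 3.0
            for i in range(0, len(monthly_index), 3)]

def to_constant_dollars(costs, index, base_position):
    """Restate each cost in the dollars of the quarter at base_position."""
    base = index[base_position]
    return [cost * base / value for cost, value in zip(costs, index)]

# Hypothetical six months of a wage index and two quarters of costs
monthly = [100.0, 101.0, 102.0, 103.0, 104.0, 105.0]
q_index = quarterly_index(monthly)
costs = [5050.0, 5200.0]
# Express both quarters in the dollars of the second (base) quarter
constant = to_constant_dollars(costs, q_index, base_position=1)
```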

Data  pertaining  to  direct  production  were  obtained  from  each  contractor.  The 
only  direct  production  data  set  utilized  in  the  analysis  was  direct  labor  personnel.  This 
category of data did not require adjustment for inflation.

III.  MODEL  DEVELOPMENT

A.       THE  REGRESSION  MODEL 

The  model  utilized  in  this  project  was  designed  at  the  request  of  the  Naval  Air 
Systems  Command  as  part  of  its  contractor  overhead  tracking  project.  The  model 
capitalized  on  the  explanatory  and  prediction  properties  of  least  squares  theory. 
Individual  regression  models  were  developed  for  each  contractor  in  an  effort  to 
accurately  predict  future  overhead  costs.  Simplicity  in  application  has  been  a  key 
factor  in  the  model's  development. 

The application of least squares to economic trend data almost immediately
suggests autocorrelation in the error terms of the regression. In the presence of
autocorrelation, the estimates of the coefficients are unbiased and consistent.
However, they are not efficient, and the estimates of the variances of the coefficients
are biased. Positively autocorrelated errors produce a downward bias, so that the
coefficient variances are underestimated. The
downward bias will produce a confidence interval which is narrower than it should be
for each coefficient. For this reason, the null hypothesis that a coefficient is
equal to 0 will be rejected in instances in which it should not be rejected. Likewise,
autocorrelated errors will cause exaggerated R-squared and F statistics when an ordinary
least squares model is applied.

The  effects  of  autocorrelation  can  be  eliminated  through  the  application  of 
generalized  least  squares  (GLS)  procedures.  The  application  of  GLS  to  autocorrelated 
data  produces  estimators  of  the  coefficients  which  possess  the  properties  of  maximum 
likelihood  estimators.  Therefore,  the  GLS  estimators  of  the  regression  coefficients  and 
the  variances  of  these  coefficients  will  be  unbiased,  consistent,  and  asymptotically 
efficient. This will lead to more reliable estimates of R-squared and F [Ref. 4: pp. 302-311].

This project examined quarterly data. For this reason first order AR(1) and
fourth order AR(4) autoregressive processes were suspected. AR(1) processes are
detectable with a Durbin-Watson test. In those cases where AR(1) was present, the data
were transformed in the following manner:

\[
Z_1^* = Z_1 \left(1 - \hat{\rho}_1^2\right)^{1/2},
\]

\[
Z_t^* = Z_t - \hat{\rho}_1 Z_{t-1}, \quad t = 2, 3, 4, \ldots, T \qquad \text{(eqn 3.1)}
\]

where

\[
\hat{\rho}_1 = \frac{\sum_{t=2}^{T} e_t e_{t-1}}{\sum_{t=2}^{T-1} e_t^2} \qquad \text{(eqn 3.2)}
\]

where

$e_t$ = residuals from the OLS regression.

In equation 3.2, $\hat{\rho}_1$ defined in this manner is the two stage Prais-Winsten
estimator derived by Park and Mitchell [Ref. 5]. The AR(4) process detected in the
model, a special form of the general AR(4) process, is written as

\[
\varepsilon_t = \rho_4 \varepsilon_{t-4} + u_t \qquad \text{(eqn 3.3)}
\]

The general form for AR(4) processes is

\[
\varepsilon_t = \rho_1 \varepsilon_{t-1} + \rho_2 \varepsilon_{t-2} + \rho_3 \varepsilon_{t-3} + \rho_4 \varepsilon_{t-4} + u_t \qquad \text{(eqn 3.4)}
\]

In both cases

\[
u_t \sim N(0, \sigma^2).
\]

In  this  model,  the  effects  of  the  three  prior  quarters  are  assumed  to  be  negligible 
while  the  effect  of  the  quarter  one  year  previous  is  considered  significant.  This  special 
form  of  AR(4)  process  is  detectable  with  a  Wallis  Test  [Ref.  6].  If  AR(4)  processes 
were  detected,  the  data  sets  were  transformed  in  the  following  manner 

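\[
Z_t^* = Z_t \left(1 - \hat{\rho}_4^2\right)^{1/2}, \quad t = 1, 2, 3, 4 \qquad \text{(eqn 3.5)}
\]

\[
Z_t^* = Z_t - \hat{\rho}_4 Z_{t-4}, \quad t = 5, 6, 7, \ldots, T \qquad \text{(eqn 3.6)}
\]

where

\[
\hat{\rho}_4 = \frac{T^2 (1 - 0.5\, d_4) + K^2}{T^2 - K^2} \qquad \text{(eqn 3.7)}
\]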


In equation 3.7, T is equal to the number of observations, K is the number of
parameters to be estimated, and $d_4$ is the Wallis test statistic, written as

\[
d_4 = \frac{\sum_{t=5}^{T} (e_t - e_{t-4})^2}{\sum_{t=1}^{T} e_t^2} \qquad \text{(eqn 3.8)}
\]

where

$e_t$ = residuals from the OLS regression model $(t = 1, 2, \ldots, T)$.


This particular estimator, derived by Theil and Nagar [Ref. 7: p. 287], was found
to be the most efficient among nine alternative estimators applied directly to these data
sets [Ref. 8: p. 49].
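
The AR(4) correction of equations 3.5 through 3.8 can be sketched in the same spirit. The function names and sample numbers below are hypothetical and serve only to illustrate the arithmetic.

```python
import math

def wallis_d4(e):
    """Equation 3.8: sum_{t=5}^{T} (e_t - e_{t-4})^2 / sum_{t=1}^{T} e_t^2."""
    num = sum((e[t] - e[t - 4]) ** 2 for t in range(4, len(e)))
    den = sum(x ** 2 for x in e)
    return num / den

def rho4_theil_nagar(d4, T, K):
    """Equation 3.7: estimate of rho_4 from the Wallis statistic."""
    return (T ** 2 * (1.0 - 0.5 * d4) + K ** 2) / (T ** 2 - K ** 2)

def ar4_transform(z, rho4):
    """Equations 3.5 and 3.6: rescale the first four values, then subtract
    rho4 times the value four quarters earlier from each later value."""
    head = [x * math.sqrt(1.0 - rho4 ** 2) for x in z[:4]]
    tail = [z[t] - rho4 * z[t - 4] for t in range(4, len(z))]
    return head + tail

# Hypothetical residuals: each quarter is the negative of the same quarter a year earlier
residuals = [1.0, 0.0, 0.0, 0.0, -1.0, 0.0, 0.0, 0.0]
d4 = wallis_d4(residuals)
rho4 = rho4_theil_nagar(d4, T=10, K=2)
transformed = ar4_transform([1.0, 1.0, 1.0, 1.0, 2.0, 2.0], 0.5)
```

A value of $d_4$ well below 2 points to positive fourth order autocorrelation, in the same way that a low Durbin-Watson statistic points to positive first order autocorrelation.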

After  each  OLS  model  was  computed,  the  Wallis  and  the  Durbin  Watson 
Statistics  were  examined  for  the  presence  of  AR(4)  or  AR(1)  processes.  Appropriate 
transformations  were  made  and  the  models  were  reestimated.  At  the  conclusion  of  the 
transformation  and  reestimation  phase,  all  traces  of  autocorrelation  had  been  removed 
from the GLS models. In each case, the residuals were examined and no test rejected
the hypothesis of normality.

The following procedure was applied to each of the seven contractors. A detailed
presentation of the results will be made for contractor A. The results from the
remaining contractors will be presented in a summarized fashion with appropriate
comments. Direct comparison of the regression models between two contractors does
not  indicate  relative  efficiency.  Aside  from  the  organizational  differences  within  each 
firm,  each  of  these  contractors  specializes  in  a  particular  branch  of  the  aerospace 
industry.  Examples  include  fixed  wing  aircraft  and  aircraft  engine  producers.  For 
these  reasons,  comparison  of  the  models  between  contractors  is  inappropriate. 

Contractor A supplied data spanning twenty-six quarters. Therefore, all three
models (regression, Box-Jenkins, and Box-Jenkins transfer function) are based on
twenty-two observations. The remaining four observations are withheld during the
model development stage and are used for comparison purposes with predicted values
during the forecasting stage. The results of the OLS and GLS models are presented in
Table 1. The adjusted R-squared value of .1823 and the F value of 5.93 indicate that the
OLS model is poor. The Wallis Statistic of .5801 indicates the presence of the special form
of AR(4) suspected in the data. The appropriate transformation to correct for AR(4)
was made and the model improved significantly: adjusted R-squared = .8718 and F = 150.6569.
At this point, the residuals indicated the effects of first order autocorrelation. A
second transformation was made and the model improved slightly: adjusted R-squared = .8787
and F = 160.3358. Both forms of autocorrelation had been completely removed at the
completion of the second transformation.


TABLE 1

REGRESSION MODELS FOR CONTRACTOR A

                                 OLS      GLS AR(4)    GLS AR(1)
Adjusted R-squared              .183         .872         .879
F Statistic                    5.921      150.657      160.336
Intercept                  54778.277    11487.623     3047.305
  Standard Error           68215.152     5924.463     2438.277
Slope                         13.960       15.701       15.946
  Standard Error               5.737        1.279        1.259
Durbin Watson Statistic        1.618         .644        1.991
Wallis Statistic                .580        1.351        1.432
Estimate of rho_1               .187         .694         .080
Estimate of rho_4               .723         .334         .294

The regression models for the remaining six contractors are depicted in Table 2.
A single transformation was required to eliminate all indications of autocorrelation in
the cases of contractors B and C. The required transformation removed the effects of
an AR(4) process. The data supplied by Contractor E were free of the effects of
autocorrelation and the model listed in Table 2 is the OLS model developed for this
contractor. The remaining contractor data sets were transformed twice to achieve the
desired residual characteristics. All of the models derived during this stage possess
good forecasting capabilities when utilized to predict the four data points which were
withheld. This will be discussed in the next chapter.

TABLE 2

SUMMARY OF CONTRACTOR REGRESSION MODELS

Model for each contractor i:  OVRHD(i) = a + b PERSONNEL(i)

Contractor   Adjusted     F Statistic   Intercept     Standard      Slope     Standard
             R-squared                     (a)         Error         (b)       Error
B              .927         243.881     11978.143     5952.942     15.215       .974
C              .840         106.326       327.038     3911.146      8.702       .844
D              .726          64.448     13864.996     3084.697      7.999       .996
E              .527          26.590      4976.203    18568.690     14.949      2.899
F              .707          63.739       884.075     1083.769     13.012      1.629
G              .857         133.155      4716.544     3020.132     20.545      1.780

B.       BOX-JENKINS  METHOD

The Box-Jenkins method of forecasting comprises three stages:
identification, estimation, and forecasting [Ref. 9: p. 19]. In the identification stage, the
series is often differenced to achieve stationarity about a mean (usually 0). During the
development of these models, differencing of the order of one period (regular
differencing) or four periods (seasonal differencing) was considered. As is standard
with Box-Jenkins methodology, no more than two differencing corrections were
required for any model [Ref. 10: p. 125]. With one exception, a stationary series was
achieved through the application of regular, seasonal, or a combination of both types
of differencing for each of the contractors. Contractor D data were already stationary
and did not require differencing. The autocorrelation (ACF) and partial
autocorrelation (PACF) functions of the stationary series are analyzed to determine the
horizontal subpatterns within the series. The stationary trend can be specified as a
linear combination of past series values (autoregressive terms), a linear combination of
past random errors (moving average terms), or a combination of both. The
determination of the appropriate number and specific lag of the autoregressive and
moving average terms is made during the analysis of the ACF and PACF. Spikes on
the ACF accompanied by trends on the PACF resembling exponential decay for the
same lag indicate the appropriateness of a moving average parameter at that lag.
Likewise, spikes on the PACF accompanied by an exponentially decaying trend on the
ACF signal the need to include an autoregressive term at that lag. Spikes on the ACF
and PACF are taken to be correlation values for a given lag which are statistically
different from 0. Once the character of the trend is identified, the estimation phase is
entered to determine the parameter values. A first order autoregressive series is
expressed as follows

\[
Z_t = \theta_0 + \phi_1 Z_{t-1} + A_t \qquad \text{(eqn 3.9)}
\]

where

$\theta_0$ = series mean
$\phi_1$ = weighting of the previous period value
$A_t$ = white noise $\sim N(0, \sigma_A^2)$.

An alternative form of the model in terms of deviation from the series mean is

\[
Z_t^* = \phi_1 Z_{t-1}^* + A_t \qquad \text{(eqn 3.10)}
\]

where

\[
Z_t^* = Z_t - \theta_0.
\]
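
The identification rule for an autoregressive term (a spike on the PACF with exponential decay on the ACF) can be illustrated on a simulated first order autoregressive series in deviation form. The sketch below is not the procedure used in the study: the function names, the coefficient 0.7, and the sample size are hypothetical, and the partial autocorrelations are obtained with the Durbin-Levinson recursion, which is one standard way to compute them.

```python
import random

def simulate_ar1_deviations(phi, n, sigma=1.0, seed=0):
    """Simulate Z*_t = phi * Z*_{t-1} + A_t with A_t ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    z, prev = [], 0.0
    for _ in range(n):
        prev = phi * prev + rng.gauss(0.0, sigma)
        z.append(prev)
    return z

def sample_acf(z, max_lag):
    """Sample autocorrelations r_1..r_max_lag about the series mean."""
    n = len(z)
    m = sum(z) / n
    c0 = sum((x - m) ** 2 for x in z)
    return [sum((z[t] - m) * (z[t + k] - m) for t in range(n - k)) / c0
            for k in range(1, max_lag + 1)]

def sample_pacf(acf):
    """Durbin-Levinson recursion: partial autocorrelations from r_1..r_K."""
    pacf, prev = [], []
    for k in range(1, len(acf) + 1):
        if k == 1:
            phi_kk = acf[0]
            cur = [phi_kk]
        else:
            num = acf[k - 1] - sum(prev[j] * acf[k - 2 - j] for j in range(k - 1))
            den = 1.0 - sum(prev[j] * acf[j] for j in range(k - 1))
            phi_kk = num / den
            cur = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
            cur.append(phi_kk)
        pacf.append(phi_kk)
        prev = cur
    return pacf

series = simulate_ar1_deviations(phi=0.7, n=500)
acf = sample_acf(series, 6)
pacf = sample_pacf(acf)
```

For a first order autoregressive series the ACF should decay roughly geometrically while the PACF should show a single spike at lag one, which is the pattern described in the identification stage.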


This expression can be expanded to include any number of past series values and is
written as follows:

\[
Z_t^* = \phi_1 Z_{t-1}^* + \phi_2 Z_{t-2}^* + \cdots + \phi_p Z_{t-p}^* + A_t. \qquad \text{(eqn 3.11)}
\]

The  backward  shift  operator,  B,  is  often  utilized  to  express  the  relationship  of  the 
present  term  to  the  relevant  past  terms.  The  operator  is  a  symbolic  indicator  and  does 
not  imply  multiplication  of  Zt  by  a  constant  B.  Its  use  indicates  a  desire  to  express 
past  consecutive  terms  of  a  series  as  an  ordered  group.  The  operator  possesses  the 
following  relationship: 

\[
B Z_t = Z_{t-1}
\]

\[
B^2 Z_t = Z_{t-2}
\]

\[
B^m Z_t = Z_{t-m}.
\]

Using the operator, the autoregressive process of order p is expressed as

\[
\phi(B) Z_t^* = A_t \qquad \text{(eqn 3.13)}
\]

with

\[
\phi(B) = (1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p).
\]


The  moving  average  model  assumes  that  the  series  can  be  expressed  as  a 
weighted  average  of  past  successive  white  noise  terms.  A  first  order  moving  average 
model  expressing  the  series  as  deviation  from  the  mean  of  the  series  is: 


\[
Z_t^* = A_t - \theta_1 A_{t-1} \qquad \text{(eqn 3.14)}
\]

where

$A_t$ = white noise series $\sim N(0, \sigma_A^2)$
$\theta_1$ = coefficient of the most recent white noise term.

This basic relationship can be expanded to include any number of past terms. A
moving average process of order q is written as:

\[
Z_t^* = A_t - \theta_1 A_{t-1} - \theta_2 A_{t-2} - \cdots - \theta_q A_{t-q}. \qquad \text{(eqn 3.15)}
\]

Using the backward shift operator this model is expressed as

\[
Z_t^* = \theta(B) A_t \qquad \text{(eqn 3.16)}
\]

with

\[
\theta(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q.
\]

A combination of these types of models can be developed; it possesses
relationships to past series terms and to past white noise elements. The process is
called a mixed autoregressive-moving average model and is expressed as:

\[
Z_t^* = \phi_1 Z_{t-1}^* + \cdots + \phi_p Z_{t-p}^* + A_t - \theta_1 A_{t-1} - \cdots - \theta_q A_{t-q}. \qquad \text{(eqn 3.17)}
\]

This expression is usually written in terms of the backward shift operator,

\[
\phi(B) Z_t^* = \theta(B) A_t. \qquad \text{(eqn 3.18)}
\]

Models which adequately describe the data rarely exhibit values of p or q greater
than two, and the values are usually less than two [Ref. 10: p. 66].
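
The mixed model of equation 3.17 is a simple recursion, which the following sketch makes explicit. The function name `arma_from_shocks`, the coefficient values, and the shock sequence are hypothetical; pre-sample values of the series and of the white noise are assumed to be zero.

```python
def arma_from_shocks(phi, theta, shocks):
    """Build Z*_t = sum_i phi_i Z*_{t-i} + A_t - sum_j theta_j A_{t-j}.

    phi and theta are lists of coefficients (lag 1 first); shocks is the
    white noise sequence A_t. Values before the start of the sample are
    treated as zero, which is an assumption of this sketch.
    """
    z = []
    for t, a_t in enumerate(shocks):
        ar = sum(phi[i] * z[t - 1 - i] for i in range(len(phi)) if t - 1 - i >= 0)
        ma = sum(theta[j] * shocks[t - 1 - j]
                 for j in range(len(theta)) if t - 1 - j >= 0)
        z.append(ar + a_t - ma)
    return z

# A single unit shock traced through a hypothetical ARMA(1,1) model
response = arma_from_shocks(phi=[0.5], theta=[0.4], shocks=[1.0, 0.0, 0.0])
```

Setting `theta` to an empty list gives the pure autoregressive model of equation 3.11, and setting `phi` to an empty list gives the pure moving average model of equation 3.15.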

The use of differencing to achieve a stationary series permits the use of the
Box-Jenkins method to model series which are nonstationary in nature. The backward
differencing operator $\nabla$ is used to indicate the following relationship:

\[
\nabla Z_t = Z_t - Z_{t-1} = (1 - B) Z_t. \qquad \text{(eqn 3.19)}
\]

The model now becomes an autoregressive integrated moving average process
(ARIMA) of order (p,d,q) and is written:

\[
\phi(B) \nabla^d Z_t = \theta(B) A_t. \qquad \text{(eqn 3.20)}
\]

In this relationship the superscript d indicates the number of times that regular,
backward differencing was utilized to achieve a stationary series.
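
Regular and seasonal differencing are the same operation applied at different lags, as the following sketch shows. The function name `difference` and the artificial series are hypothetical: the series is built from a linear trend plus a fixed quarterly pattern so that the effect of each difference is easy to see.

```python
def difference(z, lag=1):
    """Return z_t - z_{t-lag}; lag=1 is regular, lag=4 is quarterly seasonal."""
    return [z[t] - z[t - lag] for t in range(lag, len(z))]

# Artificial quarterly series: linear trend plus a repeating seasonal pattern
season = [5.0, -3.0, 8.0, -10.0]
series = [10.0 * t + season[t % 4] for t in range(12)]

seasonal_diff = difference(series, lag=4)       # removes the quarterly pattern
both_diffs = difference(seasonal_diff, lag=1)   # then removes the remaining trend
```

Each difference shortens the series by its lag, which is a real cost when only twenty-two observations are available for model development.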

Seasonal differencing was used in several of the models to achieve stationarity.
The inclusion of seasonal differencing produces a seasonal ARIMA model described as
$(p,d,q) \times (P,D,Q)_s$. The subscript s indicates the number of periods contained in one
season. In the examination of quarterly data, there are four periods in one year and
the subscript becomes a four. The general form of the seasonal ARIMA model is a
multiplicative ARIMA model. The multiplicative nature of the model indicates that
the model contains terms which are products of the regular and seasonal coefficients.
Intuitively, this makes sense. In the case of quarterly data, the value in the series five
periods previous to the present is included and has a coefficient which is the negative
product of the first order term and the seasonal term [Ref. 10: p. 164]. A multiplicative
ARIMA model of order $(1,1,0) \times (1,1,0)_4$ is expressed as:

\[
Z_t = \phi_1 Z_{t-1} + \Phi_1 Z_{t-4} - \phi_1 \Phi_1 Z_{t-5} + A_t \qquad \text{(eqn 3.21)}
\]

with

$A_t \sim N(0, \sigma_A^2)$
$\Phi_1$ = coefficient for the seasonal term

and with $Z_t$ taken here as the series after one regular and one seasonal difference.

The general expression for the multiplicative ARIMA model is

\[
\phi_p(B)\, \Phi_P(B^s)\, \nabla^d \nabla_s^D Z_t = \theta_q(B)\, \Theta_Q(B^s)\, A_t \qquad \text{(eqn 3.22)}
\]

where

$\Theta_Q$ = coefficient of the seasonal error term
$\Phi_P$ = coefficient for the seasonal term.
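
The lag five term in equation 3.21 follows directly from multiplying the regular and seasonal autoregressive polynomials, which can be checked numerically. The coefficient values below are hypothetical, and `poly_multiply` is an illustrative helper, not part of any package used in the study.

```python
def poly_multiply(a, b):
    """Multiply two polynomials in B given as coefficient lists (power 0 first)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

phi1, PHI1 = 0.5, 0.8                          # hypothetical coefficient values
regular = [1.0, -phi1]                         # 1 - phi1 * B
seasonal = [1.0, 0.0, 0.0, 0.0, -PHI1]         # 1 - PHI1 * B^4
product = poly_multiply(regular, seasonal)
# product holds the coefficients of B^0..B^5:
# 1, -phi1, 0, 0, -PHI1, +phi1 * PHI1
```

Moving the lagged terms to the right hand side reverses their signs, giving the coefficients $\phi_1$, $\Phi_1$, and $-\phi_1 \Phi_1$ at lags one, four, and five.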


Several  plausible  models  are  developed  during  the  estimation  stage  and  these 
alternative  models  can  be  compared  utilizing  the  diagnostics  provided  in  most 
computer  packages.  The  parameters  and  the  associated  residuals  from  each  plausible 
model  were  examined  to  validate  the  model.  The  residuals  were  examined  for  the 
presence  of  bias  and  autocorrelation.  The  three  indicators  used  to  detect  the  presence 
of  these  conditions  were  the  residual  mean  and  variance,  the  autocorrelations  of  the 
residuals, and the Q statistic. The model parameters were examined for statistical
significance and indications of high correlation between each other. Highly correlated
parameters are usually an indication of the inclusion of unnecessary parameters in the
model [Ref. 10: p. 98]. The desired outcome is to determine one or more models which
produce fitted values as close as possible to the original series. Additionally, it is
desired that the models have as few parameters as possible [Ref. 9: p. 17]. The final
stage of the Box-Jenkins method, forecasting, enables the user to project the series into
the future. Often, 95% confidence intervals for the projected values are provided by
the computer package. The computer package utilized during this analysis was
GRAFSTAT. It is a package being developed by IBM and is installed at the Naval
Postgraduate School for evaluation.
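
The residual check based on the Q statistic can be sketched as follows. The text does not state which form of the statistic the package reported, so the sketch assumes the Box-Pierce form (the number of residuals multiplied by the sum of the squared residual autocorrelations); the function names and sample residuals are hypothetical.

```python
def residual_autocorrelations(res, max_lag):
    """r_k = sum_t a_t a_{t-k} / sum_t a_t^2, treating the residual mean as zero."""
    c0 = sum(a ** 2 for a in res)
    return [sum(res[t] * res[t - k] for t in range(k, len(res))) / c0
            for k in range(1, max_lag + 1)]

def box_pierce_q(res, max_lag):
    """Box-Pierce Q = n * sum_{k=1}^{K} r_k^2 (assumed form of the Q statistic)."""
    r = residual_autocorrelations(res, max_lag)
    return len(res) * sum(x ** 2 for x in r)

# Hypothetical residuals with an obvious alternating pattern
residuals = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
q = box_pierce_q(residuals, max_lag=1)
```

Under the hypothesis of white noise residuals, Q is compared with a chi-square distribution whose degrees of freedom equal the number of lags less the number of estimated autoregressive and moving average parameters; a large value, as in this artificial example, indicates remaining autocorrelation.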

In general, the results derived from the Box-Jenkins method are more
accurate in the short and intermediate term than forecasts from other methods,
including regression [Ref. 11: p. 236]. The cost of these generally superior results is
measured in the computer resources required to derive the model and the expertise
required of the statistician to determine an appropriate model. The procedure allows
for  interpretation  on  the  part  of  the  forecaster.  Two  forecasters  may  identify  different 
models  as  being  the  best  model  to  fit  the  same  data  set.  Even  so,  both  sets  of  forecasts 
may  be  highly  accurate  when  compared  to  the  future  observations  of  the  series 
[Ref.  3:  p.  11]. 

The Box-Jenkins models developed for each of the contractors displayed strong
predictive properties when used to predict the withheld values. Table 3 presents the
differencing required to achieve stationarity and the final form of the model for each of
the contractors. The model for each contractor is expressed in standard Box-Jenkins
notation with a seasonal period of four. The coefficient values, the model mean, and
the standard errors of these terms are included in the presentation.

TABLE 3

SUMMARY OF BOX-JENKINS MODELS

Contractor   ARIMA model              Term        Estimate     Standard Error
A            (1,0,0) x (0,1,0)_4      AR(1)           .753            .157
                                      Mean        2023.270        6648.622
B            (1,1,0) x (0,1,0)_4      AR(1)          -.845            .171
                                      Mean         488.515        1162.532
C            (1,2,0) x (1,0,0)_4      AR(1)          -.673            .185
                                      SAR(1)          .764            .103
                                      Mean         305.319         416.400
D            (1,0,0) x (0,0,0)_4      AR(1)           .688            .163
                                      Mean      189328.647        6599.820
E            (0,1,1) x (0,0,0)_4      MA(1)           .906            .059
                                      Mean        1532.200         397.637
F            (1,1,0) x (1,1,0)_4      AR(1)          -.607            .166
                                      SAR(1)          .875            .129
                                      Mean        -350.716        2534.731
G            (1,0,0) x (0,1,0)_4      AR(1)           .854            .103
                                      Mean        1125.453       18744.529


C.       BOX-JENKINS  TRANSFER  FUNCTION

The Box-Jenkins transfer function is a procedure which allows a forecaster to
aggregate the information contained in a particular series (output) with one or more
related series (input) to forecast future values of the series. The relationship which is
usually identified is that the trend present in the input series is reflected in the output
series after a lag of several periods. Relationships of this order are referred to as
dynamic responses. The aggregation of information is achieved through a transfer
function model [Ref. 9: p. 13]. This procedure comprises the same three phases as
the univariate Box-Jenkins method. During the identification stage, differencing of the
series is usually recommended in an effort to achieve a stationary series. However, a
linear combination of elements in the output series often may be stationary, and
differencing of all of the series in the model can cause complications in identifying the
appropriate model [Ref. 12].

The data trends may be tentatively identified and prewhitened in an effort to
achieve an input series which strongly resembles white noise. This effort is made in an
attempt to improve the interpretability of the cross correlation function (CCF). If the
input series is not prewhitened, the CCF often cannot be interpreted [Ref. 13: p. 243].
A series which can be represented by an ARIMA model

\[
X_t = (1 - \phi_1 B - \cdots - \phi_p B^p)^{-1} (1 - \theta_1 B - \cdots - \theta_q B^q) A_t \qquad \text{(eqn 3.23)}
\]

or, as expressed compactly with the backward shift operator polynomials,

\[
X_t = \phi_x^{-1}(B)\, \theta_x(B)\, A_t \qquad \text{(eqn 3.24)}
\]

can be prewhitened by inverting the model. The prewhitened version of the series,
denoted $\alpha_t$, is expressed as follows:

\[
\alpha_t = \phi_x(B)\, \theta_x^{-1}(B)\, X_t. \qquad \text{(eqn 3.25)}
\]

The same transformation is applied to the output series culminating in the following
expression:

\[
\beta_t = \phi_x(B)\, \theta_x^{-1}(B)\, Z_t. \qquad \text{(eqn 3.26)}
\]

The  relationship  between  the  input  and  output  series  is  described  in  an  impulse 
response  function  of  the  form 

Zt  =  O0Xt  +  U1Xt.1  +  U2Xt-2+-  +  ,lt  (eqn  3.27) 

where 

o   is  the  impulse  response 
T(t   is  the  random  noise  term. 

Expressing  the  impulse  response  function  in  terms  of  the  prewhitened  series,  <*t  and  Bt, 
the  function  becomes 

Bt  =  0(B)at  +  £t.  (eqn  3.28) 

where  £t  is  the  transformed  noise  series  defined  by 
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£t  =  <px(B)6x-1  (B)nt  ..  (eqn3.29) 

The  impulse  response  weights  can  be  determined  through  examination  of  the 
CCF.   The  following  relationship  is  utilized  in  this  regard: 

PaB(k)SB 

Uk  =  k  =  0.1,2....     (eqn  3.30) 

where  a 

pag(k)  =    correlation  coefficient  between  the 

at  and  Bt  series  for  the  k     element 
Sa  =  estimated  standard  deviation  of  the  at  series 
Sd  =  estimated  standard  deviation  of  the  Bt  series. 

Once estimates of $\upsilon_k$ are obtained, identification of those elements of the impulse
function  which  are  statistically  significant  enables  the  forecaster  to  identify  an 
appropriate  model  to  be  used  as  a  basis  for  the  transfer  function.  From  these 
estimates,  simultaneous  equations  can  be  formed  and  solved  to  provide  preliminary 
estimates  of  the  parameters  of  the  model.  Since  the  impulse  responses  provided  by  the 
CCF  are  statistically  inefficient  in  general,  the  proposed  model  is  used  as  a  starting 
point  to  be  fitted  by  some  more  elaborate  means. 
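
The identification step described above can be sketched in Python (used here only for illustration; the thesis itself relied on GRAFSTAT and APL). The sketch assumes, hypothetically, that the input series is adequately described by an AR(1) filter with a known coefficient `phi`. The function names `prewhiten_ar1`, `cross_correlation`, and `impulse_weight`, and the sample data, are illustrative and do not come from the thesis. Both series are prewhitened with the same filter, the cross correlation at lag k is estimated, and equation 3.30 converts it into an impulse response weight.

```python
import math


def prewhiten_ar1(series, phi):
    """Apply the filter (1 - phi*B) to a series; the first point is lost."""
    return [series[t] - phi * series[t - 1] for t in range(1, len(series))]


def _mean(values):
    return sum(values) / len(values)


def _std(values):
    m = _mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def cross_correlation(alpha, beta, k):
    """Sample correlation between alpha[t] and beta[t + k] for lag k >= 0."""
    n = len(alpha)
    ma, mb = _mean(alpha), _mean(beta)
    cov = sum((alpha[t] - ma) * (beta[t + k] - mb) for t in range(n - k)) / n
    return cov / (_std(alpha) * _std(beta))


def impulse_weight(rho_k, s_alpha, s_beta):
    """Equation 3.30: weight at lag k = rho(k) * S_beta / S_alpha."""
    return rho_k * s_beta / s_alpha


# Hypothetical data: the output is exactly three times the current input.
x = [5.0, 7.0, 6.0, 9.0, 8.0, 12.0, 10.0, 13.0]
z = [3.0 * value for value in x]
phi = 0.5  # assumed AR(1) coefficient of the input series

alpha = prewhiten_ar1(x, phi)
beta = prewhiten_ar1(z, phi)
v0 = impulse_weight(cross_correlation(alpha, beta, 0), _std(alpha), _std(beta))
```

Because the hypothetical output depends only on the current input, the lag-zero weight recovers the multiplier of three. With real data, each estimated weight would also need a significance test before it is retained.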

This  procedure  is  extremely  complicated  and  places  a  high  demand  on  computer 
resources  and  the  skills  of  the  forecaster.  The  results  should  be  indicative  of  the  cost  of 
attaining  them.  In  this  particular  application  of  the  procedure,  several  difficulties  were 
encountered.  A  computer  package  to  perform  the  entire  procedure  was  not  available. 
The  GRAFSTAT  package  mentioned  previously  does  contain  a  CCF  routine. 
Therefore  the  inefficient  estimates  of  the  impulse  responses  obtained  from  the 
examination  of  the  CCF  could  be  used  to  form  simultaneous  equations.  However,  this 
would  be  the  most  elaborate  means  of  determining  the  parameters.  The  data  from 
each contractor were examined in all combinations of undifferenced, regularly
differenced,  and  seasonally  differenced  forms.  In  all  cases,  the  data  were  prewhitened 
before  examination.  Only  one  CCF  indicated  the  presence  of  an  impulse  response 
value  which  was  statistically  different  from  zero.  This  was  the  prewhitened  and 
undifferenced series for Contractor D. Therefore, the only input value used in the
forecast of the output was the present input value. The nature of this model led to a
single equation of the following form:

$\upsilon_0 = 14.39 = \omega_0.$  (eqn 3.31)

From equation 3.31, the final multivariate Box-Jenkins model was determined to be:

$Y_t = 14.39\,X_t.$  (eqn 3.32)


The  model  parameters  are  listed  in  Table  4.  The  forecasts  made  with  this  model  are 
inferior  to  those  obtained  from  the  univariate  Box-Jenkins  model.  As  will  be  discussed 
in the results chapter, the presence of the input series ($X_t$) does not produce a
regression  model  or  a  transfer  function  which  outperforms  the  univariate  Box-Jenkins 
model  for  contractor  D.  The  inability  of  the  procedure  to  develop  adequate  models  for 
the  majority  of  the  contractors  may  be  attributable  to  the  relatively  small  sample  size 
of  each  data  set.  The  longest  series  available  for  analysis  was  twenty-seven 
observations  and  the  majority  of  the  contractors  supplied  data  in  the  range  of  twenty 
to  twenty-four  observations.  Generally,  sample  sizes  larger  than  sixty  observations  or 
data  spanning  at  least  eight  complete  seasonal  periods  are  recommended  as  a  minimum 
number  of  data  points  for  the  method  to  perform  well  [Ref.  3:  p.  6]. 


TABLE 4
SUMMARY FOR CONTRACTOR D

$\rho_{\alpha\beta}$      $S_\alpha$      $S_\beta$        $\upsilon_0$
.49         1034.734        30387.40016      14.39
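
As an arithmetic check, the Table 4 values can be substituted into equation 3.30 at lag zero. The short Python sketch below assumes the four table entries are, in order, the cross correlation, the standard deviation of the prewhitened input, the standard deviation of the prewhitened output, and the resulting weight; the variable names are illustrative.

```python
# Values reported in Table 4 for Contractor D.
rho_0 = 0.49            # cross correlation at lag zero
s_alpha = 1034.734      # std. deviation of the prewhitened input series
s_beta = 30387.40016    # std. deviation of the prewhitened output series

# Equation 3.30 evaluated at k = 0.
v0 = rho_0 * s_beta / s_alpha
print(round(v0, 2))  # prints 14.39
```

The result agrees with the coefficient used in equation 3.32.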
IV. PREDICTION METHODOLOGY AND RESULTS


Each  model  was  used  to  predict  the  four  withheld  data  points.  For  the  GLS 
models  actual  X's  for  the  four  periods  were  available  and  used.  In  those  cases  where 
only  one  form  of  autocorrelation  was  present  in  the  data,  a  transformation  of  the 
following  type  was  made  to  the  original  data  prior  to  estimation: 

$\hat{Y}_t = \hat{\rho}_i Y_{t-i} + (X_t - \hat{\rho}_i X_{t-i})\,B$  (eqn 4.1)

(t = T-3, T-2, T-1, T)
(i = 1 or 4).

The majority of the GLS models required two transformations to remove all forms of
autocorrelation.  In  those  cases,  the  X's  and  Y's  were  transformed  in  the  following 
manner  before  the  final  GLS  model  was  used  to  predict  the  withheld  values. 

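Transform the data for AR(4):

$Y_t^{*} = Y_t - \hat{\rho}_4 Y_{t-4}$
$X_t^{*} = X_t - \hat{\rho}_4 X_{t-4}$  (eqn 4.2)

(t = 5, 6, 7, ..., T).

Transform these new data to remove the effects of AR(1):

$Y_t^{**} = Y_t^{*} - \hat{\rho}_1 Y_{t-1}^{*}$
$X_t^{**} = X_t^{*} - \hat{\rho}_1 X_{t-1}^{*}$  (eqn 4.3)

(t = 1, 2, 3, ..., T).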

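Substitution of equation 4.2 into equation 4.3 allows both equations to be
combined as follows:

$Y_t^{**} = Y_t - \hat{\rho}_4 Y_{t-4} - \hat{\rho}_1 (Y_{t-1} - \hat{\rho}_4 Y_{t-5})$
$X_t^{**} = X_t - \hat{\rho}_4 X_{t-4} - \hat{\rho}_1 (X_{t-1} - \hat{\rho}_4 X_{t-5})$  (eqn 4.4)

(t = 6, 7, 8, ..., T).

These equations can be simplified to the following form:

$Y_t^{**} = Y_t - \hat{\rho}_1 Y_{t-1} - \hat{\rho}_4 Y_{t-4} + \hat{\rho}_1 \hat{\rho}_4 Y_{t-5}$
$X_t^{**} = X_t - \hat{\rho}_1 X_{t-1} - \hat{\rho}_4 X_{t-4} + \hat{\rho}_1 \hat{\rho}_4 X_{t-5}$  (eqn 4.5)

(t = 6, 7, 8, ..., T).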
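
The equivalence of the two-step and one-step transformations can be checked numerically. The Python sketch below is illustrative only: the function names `quasi_difference` and `combined_transform`, the sample series, and the autocorrelation estimates are all hypothetical. It removes AR(4) and then AR(1) in sequence and compares the result with the single combined expression.

```python
def quasi_difference(series, rho, lag):
    """Return series[t] - rho * series[t - lag] for every t with a full lag."""
    return [series[t] - rho * series[t - lag] for t in range(lag, len(series))]


def combined_transform(series, rho1, rho4):
    """One-step form of the AR(4)-then-AR(1) correction; five points are lost."""
    return [
        series[t] - rho1 * series[t - 1] - rho4 * series[t - 4]
        + rho1 * rho4 * series[t - 5]
        for t in range(5, len(series))
    ]


# Hypothetical quarterly series and autocorrelation estimates.
y = [10.0, 12.0, 11.0, 15.0, 14.0, 16.0, 15.0, 19.0, 18.0, 21.0]
rho1_hat, rho4_hat = 0.6, 0.3

# Step 1 removes AR(4); step 2 removes AR(1) from the step-1 output.
two_step = quasi_difference(quasi_difference(y, rho4_hat, 4), rho1_hat, 1)
one_step = combined_transform(y, rho1_hat, rho4_hat)
```

Both lists contain the same values, and each is five observations shorter than the original series, which matters when a contractor supplies only twenty or so quarters.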

Assuming that all forms of autocorrelation have been removed, the following
relationship holds:

$Y_t^{**} = X_t^{**} B + u_t$  (eqn 4.6)

$u_t \sim N(0, \sigma^2)$
(t = 6, 7, 8, ..., T).

The entire transformation can be made in one step and forecasts can be made
with the following equation.

$\hat{Y}_t = \hat{\rho}_1 Y_{t-1} + \hat{\rho}_4 Y_{t-4} - \hat{\rho}_1 \hat{\rho}_4 Y_{t-5} + (X_t - \hat{\rho}_1 X_{t-1} - \hat{\rho}_4 X_{t-4} + \hat{\rho}_1 \hat{\rho}_4 X_{t-5})\,B$  (eqn 4.7)

(t = 6, 7, 8, ..., T).

This  set  of  equations  pertained  to  the  instance  in  which  AR(4)  was  removed  first  and 
AR(1)  was  removed  during  the  second  application  of  GLS.  These  equations  were 
developed  by  substituting  the  transformation  for  AR(4)  into  the  equation  for  the 
removal  of  AR(1)  processes.  A  similar  expression  was  derived  for  those  cases  in  which 
the  correction  for  AR(1)  preceded  the  removal  of  AR(4)  processes.  In  the  case  of  the 
Y's,  this  procedure  required  an  iterative  process  to  determine  the  last  three  values  in 
the  Y  vector. 
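
One way to read the iterative step is that each forecast after the first reuses earlier forecasts in place of the withheld Y values. The Python sketch below illustrates that reading and is not the thesis's APL code: the function name `forecast_gls`, the separate intercept `b0` and slope `b1`, and the sample data are assumptions made for illustration. The intercept is transformed in the same way as the X series, on the assumption that the column of ones passes through both corrections.

```python
def forecast_gls(y_known, x_all, rho1, rho4, b0, b1, steps=4):
    """Forecast `steps` periods beyond y_known with the one-step GLS equation.

    x_all holds X for the known periods followed by the forecast periods.
    b0 and b1 are the intercept and slope of the transformed regression.
    """
    y = list(y_known)
    n = len(y_known)
    # The column of ones, after both autocorrelation corrections.
    const_star = 1.0 - rho1 - rho4 + rho1 * rho4
    for h in range(steps):
        t = n + h
        x_star = (x_all[t] - rho1 * x_all[t - 1] - rho4 * x_all[t - 4]
                  + rho1 * rho4 * x_all[t - 5])
        y_hat = (rho1 * y[t - 1] + rho4 * y[t - 4] - rho1 * rho4 * y[t - 5]
                 + b0 * const_star + b1 * x_star)
        y.append(y_hat)  # later forecasts reuse this value as Y at t - 1
    return y[n:]


# Hypothetical check: data that follow Y = 2 + 3X exactly.
x_all = [float(i) for i in range(1, 13)]
y_known = [2.0 + 3.0 * value for value in x_all[:8]]
forecasts = forecast_gls(y_known, x_all, 0.6, 0.3, 2.0, 3.0)
expected = [2.0 + 3.0 * value for value in x_all[8:]]
```

With error-free data the recursion reproduces the underlying line, which is a convenient sanity check before applying it to real series.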

The  Box-Jenkins  models  developed  for  each  contractor  were  used  to  produce 
forecasts  which  were  compared  to  the  withheld  data  points.  These  forecast  values  were 
provided  by  the  GRAFSTAT  package.  The  package  also  provided  ninety-five  percent 
prediction  intervals  for  each  forecast.  A  graphical  presentation  of  the  forecast  is 
provided  with  an  analysis  of  each  contractor's  data  in  the  remainder  of  this  chapter. 

Forecasts  were  made  with  the  multivariate  Box-Jenkins  model.  These  were  made 
by  substituting  the  known  values  in  the  input  (Xt)  series  into  equation  3.32  and 
completing  the  computations. 
The forecasting capabilities of each model were measured by three comparison
indicators: the correlation coefficient between the actual and predicted values, the root
mean squared error divided by the mean of the actuals, and the mean absolute percentage
error. An analysis of the data and a presentation of the prediction results for each
contractor follow.
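
The three indicators can be sketched in Python as follows. The function names and the sample values are illustrative and do not come from the thesis; the definitions follow the standard formulas for each measure.

```python
import math


def correlation(actual, predicted):
    """Pearson correlation coefficient between actual and predicted values."""
    n = len(actual)
    ma, mp = sum(actual) / n, sum(predicted) / n
    cov = sum((a - ma) * (p - mp) for a, p in zip(actual, predicted))
    var_a = sum((a - ma) ** 2 for a in actual)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def rmse_over_mean(actual, predicted):
    """Root mean squared error divided by the mean of the actuals."""
    n = len(actual)
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n
    return math.sqrt(mse) / (sum(actual) / n)


def mape(actual, predicted):
    """Mean absolute percentage error, expressed in percent."""
    n = len(actual)
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / n


# Hypothetical withheld values and forecasts.
actual = [100.0, 200.0]
predicted = [110.0, 180.0]
ratio = rmse_over_mean(actual, predicted)
error_pct = mape(actual, predicted)
```

A higher correlation and lower values of the other two measures indicate a better forecast.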

A.       CONTRACTOR  A 

Contractor A supplied twenty-two data points. The data were categorized and
deflated to constant dollars as specified in Chapter II. Graphical presentations of the
raw data are presented in the top two graphs of Figure 4.1. The upper left hand graph
is a presentation of the overhead cost across time (twenty-two consecutive quarters).
Likewise, the upper right hand graph is a display of the direct personnel trend across
time. The overhead cost versus time graph displays a sharp decline in the first four
quarters and a somewhat cyclic increase thereafter. The trend is increasing in general.
The direct personnel versus time graph indicates a similar trend in general, but does
not appear to be influenced by a seasonal trend to the extent that the overhead cost
trend is. The lower left hand graph displays the relationship of overhead cost versus
direct personnel. The weak relationship is depicted graphically and in the results of the
OLS regression (adjusted R² = .183). The lower right hand graph in Figure 4.1 shows
the relationship of overhead cost to direct personnel after both series have been
transformed to remove the effects of autocorrelation. The adjusted R² for this model
is .8787 as indicated in Table 3. The graphical portrayal of the transformed data
suggests that the GLS model should be a dramatic improvement over the OLS model
as a prediction tool. A summary of the predictive results of the GLS model is
presented in Table 5. Actual X's were available and used in making predictions for the
four withheld quarters.

The  Box-Jenkins  model  developed  for  Contractor  A  was  presented  in  Chapter  3. 
A  topic  of  concern  was  the  amount  of  data  available  which  could  be  used  as  a  basis  for 
the  model.  Generally,  the  amount  of  data  needed  to  develop  accurate  models  is  fifty 
observations  with  one  hundred  preferred.  Models  can  be  developed  in  the  absence  of 
these  amounts  of  data,  but  the  forecaster  must  utilize  experience  and  past  information 
to  develop  preliminary  models  which  can  be  updated  as  more  information  becomes 
available  [Ref.  9:  p.  18].  A  graphical  portrayal  of  the  results  of  the  Box-Jenkins  model 
is presented in Figure 4.2. The left hand graph is the actual overhead series including
the periods for which predictions were made. The right hand graph is the first twenty-two
observations and the four prediction values. The trend of the predicted values can
be compared to the actual trend by mentally superimposing one graph on the other.
As indicated in Table 5, the results of the Box-Jenkins model for Contractor A are
superior to the results of the GLS model.

Figure 4.1 Regression Analysis Graphs for Contractor A.

Figure 4.2 Box-Jenkins Graphs for Contractor A.

TABLE 5

PREDICTION RESULTS FOR CONTRACTOR A

                                                REGRESSION    BOX-JENKINS
Correlation Coefficient:                           .862          .881
Root Mean Squared Error/Mean of the Actuals:       .0649         .0376
Mean Absolute Percentage Error:                    5.98          2.79

Attempts  to  develop  a  multivariate  Box-Jenkins  model  were  made  and  proved  to 
be unsuccessful. The cross correlation function (CCF) was plotted. However,
none  of  the  impulse  weights  were  statistically  significant.  All  reasonable  combinations 
of  differencing  were  examined  in  addition  to  a  CCF  in  which  no  differencing  was 
included.   The  outcome  was  the  same  in  all  cases. 

B.       CONTRACTOR  B 

Figures  4.3  and  4.4  and  Table  6  are  provided  for  Contractor  B.  Figure  4.3 
displays  the  raw  overhead  cost  and  direct  personnel  series  for  Contractor  B.  As 
indicated  in  the  upper  right  hand  graph,  the  overhead  series  has  an  increasing  trend 
accompanied  by  a  seasonal  variation.  The  direct  personnel  series  is  characterized  as  a 
consistently  increasing  series.  A  comparison  of  the  slopes  of  the  series  in  each  graph 
indicates  that  the  quarterly  direct  personnel  count  is  increasing  at  a  slightly  greater  rate 
than  the  overhead  cost  per  period  is.  The  overhead  cost  versus  direct  personnel  chart 
reveals  the  presence  of  a  relationship  between  those  two  series  which  is  stronger  than 
the near randomness revealed in the same graph for Contractor A. This is supported
by the results of the OLS model which are surprisingly good with an adjusted R² of
.669. After transformation, the GLS model achieves an adjusted R² of .927. There are
three transformed observations which are easily distinguished from the remainder of
the transformed data. These observations are located closely together on the
transformed data graph. The GLS model developed possesses all of the indications of
a statistically significant and worthwhile prediction model. The adjusted R², F-statistic,
and T-statistic for the slope are all significant [Ref. 14: p. 133]. Despite this
fact, the Box-Jenkins model developed for Contractor B is superior as a forecasting
tool in the range which is being examined in this paper. The multivariate Box-Jenkins
model suffered from the same shortcomings as the model for Contractor A. The
impulse weights were determined to be statistically insignificant.

Figure 4.3 Regression Analysis Graphs for Contractor B.

Figure 4.4 Box-Jenkins Graphs for Contractor B.

TABLE 6

PREDICTION RESULTS FOR CONTRACTOR B

                                                REGRESSION    BOX-JENKINS
Correlation Coefficient:                           .584          .675
Root Mean Squared Error/Mean of the Actuals:       .0952         .0433
Mean Absolute Percentage Error:                    8.58          3.91

C.       CONTRACTOR  C 

Figures  4.5  and  4.6  and  Table  7  are  provided  for  Contractor  C.  The  overhead 
cost  and  direct  personnel  graphs  derived  from  Contractor  C  data  are  presented  in 
Figure  4.5.  The  overhead  cost  trend  appears  to  fluctuate  significantly  about  a  mean 
value of approximately $90,000,000 during the first seventeen quarters with the
minimum  value  of  the  series  occurring  in  the  fifteenth  quarter.  The  series  shows  a 
departure  from  this  trend  during  the  last  eight  quarters.  The  last  four  values  depicted 
on  the  top  two  graphs  are  values  which  were  not  included  during  the  model 
development stage. These are the values which are being withheld for prediction
comparison purposes. The direct personnel graph shows an increase during the first
nine periods followed by a decrease for approximately eight periods. At that point, the
number of direct workers employed at Contractor C increases for the remaining eight
periods. This trend matches the overhead cost trend in general for the entire length of
the data strings, but does not appear to be influenced by a seasonal component. The
plot of overhead cost versus direct personnel appears nearly random (adjusted R² =
.291). The plot of the transformed data displays a strong direct relationship between
the two variables. As the number of personnel increases, the overhead cost increases.
This is apparent in the GLS model which has an adjusted R² of .840 and an F statistic
of 106.326. The Box-Jenkins model graphs are presented in Figure 4.6. These graphs
are difficult to mentally superimpose on each other. This problem is caused by the
bounds of the 95% confidence interval for the forecast observations. In the case of
Contractor C, the regression model produces predictions which are closer to
the actual values than those calculated by the Box-Jenkins model. A comparison of
the predictive results is presented in Table 7. The multivariate Box-Jenkins model was
unusable for Contractor C.

Figure 4.5 Regression Analysis Graphs for Contractor C.

Figure 4.6 Box-Jenkins Graphs for Contractor C.

TABLE 7

PREDICTION RESULTS FOR CONTRACTOR C

                                                REGRESSION    BOX-JENKINS
Correlation Coefficient:                           .671          .338
Root Mean Squared Error/Mean of the Actuals:       .112          .117
Mean Absolute Percentage Error:                    8.52          9.10


D.       CONTRACTOR  D 

The data supplied by Contractor D spanned twenty periods and are presented in
Figures 4.7 and 4.8 and Table 8. The overhead cost versus time plot indicates that the
series started at a relatively low point and increased rapidly in the fifth quarter. The
trend remained relatively constant for six periods at which time it began to decrease
slowly. This trend remained consistent through the twentieth quarter which is the last
period included in the model development stage. The direct personnel series is similar
to the overhead cost trend with an abrupt decrease in the thirteenth quarter.
Beginning in the twenty-first quarter, the direct personnel trend becomes flat and
neither increases nor decreases for the remainder of the series. These last four values
are the known X's which are used in the regression model and the multivariate Box-Jenkins
models. As the X's (direct personnel) become a level function in the prediction
interval of the data, the Y's (overhead cost) suddenly increase. This departure from
the previous trend causes problems with both of the models which utilize the available
X values to generate predictions. The effect of this departure from the trend is
indicated by the relatively poor results of these models as listed in Table 8. The
multivariate Box-Jenkins model is predicated on a single impulse weight which was
significant. This occurred at a lag of zero periods. The model which was developed
utilized only the present period X value to predict the value of Y. Therefore, this
model was susceptible to the trend departures present in the data.

Figure 4.7 Regression Analysis Graphs for Contractor D.

Figure 4.8 Box-Jenkins Graphs for Contractor D.

TABLE 8

PREDICTION RESULTS FOR CONTRACTOR D

                                                REGRESSION    BOX-JENKINS    TRANSFER FUNCTION
Correlation Coefficient:                           -.757         .641           .418
Root Mean Squared Error/Mean of the Actuals:       .197          .0435          .126
Mean Absolute Percentage Error:                    18.64         3.54           12.18

E.       CONTRACTOR  E 

Figures  4.9  and  4.10  and  Table  9  are  provided  for  Contractor  E.  The  regression 
model  developed  for  Contractor  E  is  unique  in  the  fact  that  it  did  not  need  to  be 
corrected for autocorrelation. As indicated in the graphs in Figure 4.9, both the
overhead cost and the direct personnel series are generally increasing with time. The
overhead cost series tends to fluctuate above and below the general trend throughout
the length of the series with the most significant departure from the general trend
occurring in the twenty-first to the twenty-fifth data observations. The direct
personnel series increased initially in the first ten periods and remained fairly constant
at a level of approximately 6500 workers for the next ten periods. This series shows a
sharp increase in the twenty-first period and continues to increase at a somewhat
slower rate thereafter. The overhead cost versus direct personnel graph displays a
direct relationship between these two variables in the absence of an autocorrelation
correction. The strength of this relationship is less apparent as the number of
personnel increases. The four observations in the upper right hand corner of this
graph are the last four observations in each series in the model formulation range.
Their proximity to each other is a function of the fact that the direct personnel series is
slowly increasing in the twenty-first through the twenty-fourth observations. This
explains the remoteness of their placement on the graph as they occur later in time
than the large increase in the number of workers that was recorded in the twenty-first
quarter. The appearance of these points as a nearly vertical line is a function of the
small increase in the direct personnel component of the graph and the large
fluctuations that occurred in the overhead cost series in the twenty-first to twenty-fourth
quarters. The OLS model has an R² of .527 and an F statistic of 26.59.

The ARIMA model for Contractor E was also unique in the fact that it was the
only contractor model which utilized a moving average (MA) process to describe the
data trend. This model was difficult to identify and was actually determined through a
process of elimination. Every reasonable model was analyzed and the ARIMA
(0,1,1) x (0,0,0)4 was chosen on the basis of the statistical significance of the coefficient
and the model's performance as a forecasting tool. Several models which actually
produced slightly better prediction results were excluded from consideration because
the resulting coefficients were not significant. The multivariate Box-Jenkins model
could not be developed. The prediction results of the regression and Box-Jenkins
models are presented in Table 9.

Figure 4.9 Regression Analysis Graphs for Contractor E.

Figure 4.10 Box-Jenkins Graphs for Contractor E.

TABLE 9

PREDICTION RESULTS FOR CONTRACTOR E

                                                REGRESSION    BOX-JENKINS
Correlation Coefficient:                           -.434         -.606
Root Mean Squared Error/Mean of the Actuals:       .107          .137
Mean Absolute Percentage Error:                    8.99          13.83


F.       CONTRACTOR  F 

Contractor  F  was  the  only  contractor  to  supply  data  covering  the  entire 
observation  period.  Figures  4.11  and  4.12  and  Table  10  are  provided  for  Contractor  F. 
The  overhead  cost  and  direct  personnel  series  display  similar  trends.  Both  series 
increase  until  the  ninth  period,  then  decrease  in  general  and  eventually  resume 
increasing.  The  overhead  cost  trend  resumes  increasing  in  the  sixteenth  quarter.  This 
latter  trend  in  the  overhead  cost  series  displays  a  seasonal  fluctuation  with  the 
generally  increasing  trend.  The  direct  personnel  series  decreases  until  the  sixteenth 
period  at  which  point  it  becomes  a  nearly  constant  function  for  six  quarters.  Beginning 
in  the  twenty-fourth  quarter,  the  number  of  workers  begins  a  rapidly  increasing  trend 
and  continues  in  this  manner  through  the  remainder  of  the  series.  Several  significant 
single  increases  and  decreases  occur  within  the  direct  personnel  trend.  Three  of  these 
rapid  changes  in  the  number  of  direct  workers  employed  by  Contractor  F  border  above 
and  below  the  3600  to  3800  interval  in  the  work  force  level.  The  absence  of  data 
observations  in  this  range  creates  two  distinct  clusters  of  observations  in  the  overhead 
cost  versus  direct  personnel  graph.  This  plot  does  not  appear  random  because  of  the 
blank  interval  between  the  two  groupings.  However,  the  plot  does  not  display  a  strong 
relationship between the components either. The OLS model has an R² of .317. Both
forms of autocorrelation, AR(1) and AR(4), were removed from the raw data through
transformation and the resulting series displays a direct relationship as depicted in the
lower right hand graph of Figure 4.11. The application of GLS to the raw data
improves the model significantly.

Figure 4.11 Regression Analysis Graphs for Contractor F.

Figure 4.12 Box-Jenkins Graphs for Contractor F.

TABLE 10

PREDICTION RESULTS FOR CONTRACTOR F

                                                REGRESSION    BOX-JENKINS
Correlation Coefficient:                           .623          .440
Root Mean Squared Error/Mean of the Actuals:       .0886         .0823
Mean Absolute Percentage Error:                    8.23          6.86

The  Box-Jenkins  model  appears  to  approximate  the  trend  well  but  underestimates 
the  magnitude  of  the  actual  values.  This  model  is  approximately  equal  to  the  GLS 
model  in  prediction  power.  The  attempted  development  of  the  multivariate  Box- 
Jenkins  model  for  Contractor  F  failed  to  indicate  a  significant  impulse  weight. 
Therefore,  the  development  of  the  model  was  not  pursued. 


G.       CONTRACTOR  G 

The  data  supplied  by  Contractor  G  spanned  twenty-four  quarters.  The  graphs  in 
Figure  4.13  indicate  that  both  the  overhead  cost  and  direct  personnel  series  follow 
similar  patterns  during  the  twenty-four  quarters.  There  is  an  increasing  trend  which 
becomes  a  decreasing  trend  in  the  vicinity  of  the  seventh  quarter  for  both  series.  The 
overhead  cost  series  displays  the  influence  of  seasonal  fluctuations  about  the  general 
trend.  The  lower  left  hand  plot  of  overhead  cost  versus  direct  personnel  indicates  that 
the uncorrected data possess a noticeable direct relationship to each other. The R²
value for the OLS model was .541. After the data were corrected for both AR(1) and
AR(4) processes, the resulting GLS model had an R² of .862. The F statistic for this
model was 133.155. Table 11 illustrates the outstanding predictive results of this
model. The Box-Jenkins model for Contractor G is portrayed graphically in Figure
4.14. The scaling of the two graphs is different because of the inclusion of the ninety-five
percent confidence intervals in the forecast graph. As indicated in Table 11, the
model is an effective forecasting tool for the specified prediction range. However, the
regression model is superior in all three categories of comparison. The multivariate
Box-Jenkins model could not be developed for Contractor G. The recurring problem
of insignificant impulse weights plagued this data set as it has several others examined
in this thesis.

Figure 4.13 Regression Analysis Graphs for Contractor G.

Figure 4.14 Box-Jenkins Graphs for Contractor G.

TABLE 11

PREDICTION RESULTS FOR CONTRACTOR G

                                                REGRESSION    BOX-JENKINS
Correlation Coefficient:                           .638          .604
Root Mean Squared Error/Mean of the Actuals:       .0262         .0394
Mean Absolute Percentage Error:                    1.96          3.15

V. CONCLUSIONS

The  intent  of  this  project  was  to  develop  and  compare  forecasting  models  to  be 
used  in  the  prediction  of  overhead  costs  for  seven  government  aerospace  contractors. 
This  is  part  of  a  continuing  overhead  tracking  project  at  the  Naval  Air  Systems 
Command.  Three  types  of  methodology  were  considered  as  possible  model  sources. 
These  included  least  squares  regression  models,  Box-Jenkins  methods,  and  Box-Jenkins 
transfer  functions. 

The  regression  models  which  were  examined  were  developed  to  be  used  by 
unsophisticated  forecasters  and  operate  well  on  a  microcomputer.  The  characteristics 
of  the  data  suggested  the  complications  in  model  development  which  are  associated 
with autocorrelation. Consequently, these models test for two forms of autoregressive
processes: AR(1) and AR(4). Any necessary adjustments are made within the model.
A review of several literature sources indicated that the predictive capabilities of
regression  models  are  usually  inferior  to  those  of  the  Box-Jenkins  method  in  the  short 
run.  In  general  this  was  found  to  be  true.  However,  the  regression  models  performed 
well  in  most  cases  and  resulted  in  superior  forecasting  models  for  two  of  the 
contractors. 

The  Box-Jenkins  class  of  models  is  considerably  more  complex  than  least  squares 
regression.  Therefore,  a  more  experienced  forecaster  and  more  efficient  computer 
resources  are  required  to  employ  this  method.  The  majority  of  the  Box-Jenkins 
transfer  function  models  could  not  be  developed  for  the  data  in  this  project.  The 
difficulties  encountered  are  probably  due  to  the  small  sample  size  for  each  data  set. 
Additionally,  a  computer  package  was  not  available  which  could  perform  the  entire 
transfer  function  procedure. 

In those cases in which these requirements can be fulfilled, either of the Box-Jenkins
methods should be utilized. In the absence of continuous access to such
capabilities, as is the case in the Naval Air Systems Command overhead tracking
project, least squares regression theory can be used to develop adequate models
whose results are usually far superior to those obtained from the
prevailing practice of applying estimated overhead rates to estimated direct labor
hours.
APPENDIX
APL  FUNCTIONS 

The following APL functions perform the OLS and GLS regressions. The GLS function
transforms the data to remove AR(4) processes, and the GLSD function makes the
corresponding transformation for AR(1). The PRED function computes the measures of
predictive effectiveness for the OLS, Box-Jenkins, and transfer function forecasts.
The PRED1, PRED4, and PRED40 functions perform the same task for the regression models
that required a transformation because of autocorrelation. PRED1 was used in those
cases where only AR(1) processes were detected during regression model development,
and PRED4 in those cases where AR(4) processes were the only form of autocorrelation.
PRED40 makes the appropriate transformations for both AR(1) and AR(4) processes,
regardless of the order in which they were removed.

    ∇ OLS;X2;Y;N;X1;XT;X;XTXI;XTY;K1;K;YHAT;YBAR;AY;ERR;ET1
      ⍝ THIS FUNCTION COMPUTES THE REGRESSION STATISTICS OF AN
      ⍝ INDEPENDENT VARIABLE AND A DEPENDENT VARIABLE.
      ⍝ THE FUNCTION SUPPLIES THE COLUMN OF ONES.
      'ENTER THE INDEPENDENT VARIABLE'
      DL←⎕
      'ENTER THE DEPENDENT VARIABLE'
      OH←⎕
      X2←DL
      Y←OH
      N←⍴Y
      Y←(N,1)⍴Y
      X1←N⍴1
      XT←(2,N)⍴(X1,X2)
      X←⍉XT
      XTXI←⌹(XT+.×X)
      XTY←XT+.×Y
      B←XTXI+.×XTY
      K1←⍴B
      K←K1[1]
      YHAT←X+.×B
      E←Y-YHAT
      RSS←1⍴((⍉E)+.×E)
      S2←RSS÷(N-K)
      S←S2*0.5
      YBAR←(+/((N)⍴Y))÷N
      AY←((N)⍴Y)-YBAR
      TSS←((N)⍴Y)+.×AY
      ESS←TSS-RSS
      R2←ESS÷TSS
      ADJR2←1-((RSS÷(N-K))÷(TSS÷(N-1)))
      COVB←S2×XTXI
      STERRA←COVB[1;1]*0.5
      STERRB←COVB[2;2]*0.5
      TA←B[1;1]÷(COVB[1;1]*0.5)
      TB←B[2;1]÷(COVB[2;2]*0.5)
      F←(R2÷(K-1))÷((1-R2)÷(N-K))
      ERR←(N)⍴E
      ET1←0,((N-1)↑ERR)
      EET1←(1↓(ERR-ET1))*2
      D1←1⍴((+/EET1)÷RSS)
      ET4←0 0 0 0,((N-4)↑ERR)
      EET4←(4↓(ERR-ET4))*2
      D4←1⍴((+/EET4)÷RSS)
      P41←1-(0.5×D4)
      P42←(((N*2)×(1-(0.5×D4)))+(K*2))÷((N*2)-(K*2))
      P10←(+/(1↓(ERR×ET1)))÷(+/(1↓¯1↓(ERR×ERR)))
      'RSS          ',⍕RSS
      'ESS          ',⍕ESS
      'TSS          ',⍕TSS
      'S            ',⍕S
      ' '
      'ADJR2        ',⍕ADJR2
      ' '
      'A            ',(⍕B[1;])
      'STERRA       ',⍕STERRA
      'B            ',(⍕B[2;])
      'STERRB       ',⍕STERRB
      ' '
      'F            ',⍕F
      'TA           ',⍕TA
      'TB           ',⍕TB
      ' '
      'D1           ',⍕D1
      'P10          ',⍕P10
      ' '
      'D4           ',⍕D4
      'P41          ',⍕P41
      'P42          ',⍕P42
    ∇
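For readers without access to an APL interpreter, the statistics produced by the OLS function can be sketched in Python. This is an illustrative sketch only, not part of the original listings: the names `ols_simple` and `durbin_watson` are hypothetical, the regression is solved in closed form for one regressor plus a constant rather than by matrix inversion, and the dictionary keys merely echo the APL variable names. D1 is the Durbin-Watson statistic, D4 is the same ratio computed on lag-4 differences, and P41, P42, and P10 are the autocorrelation estimates computed in the listing.

```python
import math

def durbin_watson(err, lag):
    """Sum of squared lag-`lag` residual differences divided by the RSS."""
    rss = sum(e * e for e in err)
    diffs = sum((err[t] - err[t - lag]) ** 2 for t in range(lag, len(err)))
    return diffs / rss

def ols_simple(x, y):
    """Regress y on a constant and one regressor x (closed-form OLS)."""
    n, k = len(y), 2                      # K = 2 coefficients, as in the listing
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx                         # slope, B[2;1] in the listing
    a = ybar - b * xbar                   # intercept, B[1;1] in the listing
    err = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    rss = sum(e * e for e in err)
    tss = sum((yi - ybar) ** 2 for yi in y)
    s2 = rss / (n - k)
    r2 = (tss - rss) / tss
    d1 = durbin_watson(err, 1)
    d4 = durbin_watson(err, 4)
    return {
        "a": a, "b": b, "rss": rss, "tss": tss, "r2": r2,
        "adjr2": 1 - (rss / (n - k)) / (tss / (n - 1)),
        "sterr_a": math.sqrt(s2 * (1 / n + xbar ** 2 / sxx)),
        "sterr_b": math.sqrt(s2 / sxx),
        "f": (r2 / (k - 1)) / ((1 - r2) / (n - k)),
        "d1": d1, "d4": d4,
        # Estimates of the fourth-order coefficient derived from D4.
        "p41": 1 - 0.5 * d4,
        "p42": (n ** 2 * (1 - 0.5 * d4) + k ** 2) / (n ** 2 - k ** 2),
        # First-order estimate: lag-1 cross products over interior squares.
        "p10": sum(err[t] * err[t - 1] for t in range(1, n))
               / sum(e * e for e in err[1:-1]),
    }
```

The sketch assumes at least five observations with a nonzero residual sum of squares and an imperfect fit; with an exact fit the Durbin-Watson ratios and the F statistic are undefined.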

    ∇ GLS;X2;Y;N;Y1;Y2;Y1G;Y2G;X21;X22;X21G;X22G;X1;XT;X
      ⍝ THIS FUNCTION TRANSFORMS THE DATA TO REMOVE THE EFFECTS
      ⍝ OF AR(4).  THE VECTORS OF XG AND YG ARE THE
      ⍝ TRANSFORMED VECTORS AND ARE GLOBAL VARIABLES.
      'ENTER THE INDEPENDENT VARIABLE'
      A←⎕
      'ENTER THE DEPENDENT VARIABLE'
      Z←⎕
      'ENTER THE ESTIMATE OF P4'
      P4←⎕
      X2←A
      Y←Z
      N←⍴Y
      Y1←4↑Y
      Y2←(N-4)↑Y
      Y1G←Y1×((1-(P4*2))*0.5)
      Y2G←(4↓Y)-(P4×Y2)
      YG←Y1G,Y2G
      Y←(N,1)⍴YG
      X21←4↑X2
      X22←(N-4)↑X2
      X21G←X21×((1-(P4*2))*0.5)
      X22G←(4↓X2)-(P4×X22)
      XG←(X21G,X22G)
      X1←N⍴1
      XT←(2,N)⍴(X1,XG)
      X←⍉XT
      XTXI←⌹(XT+.×X)
      XTY←XT+.×Y
      BG←XTXI+.×XTY
      K1←⍴BG
      K←K1[1]
      YHATG←X+.×BG
      EG←Y-YHATG
      RSSG←1⍴((⍉EG)+.×EG)
      S2G←RSSG÷(N-K)
      SG←S2G*0.5
      YBARG←(+/((N)⍴YG))÷N
      AYG←((N)⍴YG)-YBARG
      TSSG←((N)⍴YG)+.×AYG
      ESSG←TSSG-RSSG
      R2G←ESSG÷TSSG
      ADJR2G←1-((RSSG÷(N-K))÷(TSSG÷(N-1)))
      COVBG←S2G×XTXI
      STERRAG←COVBG[1;1]*0.5
      STERRBG←COVBG[2;2]*0.5
      TAG←BG[1;1]÷(COVBG[1;1]*0.5)
      TBG←BG[2;1]÷(COVBG[2;2]*0.5)
      FG←(R2G÷(K-1))÷((1-R2G)÷(N-K))
      ERRG←(N)⍴EG
      ET1G←0,((N-1)↑ERRG)
      EET1G←(1↓(ERRG-ET1G))*2
      D1G←1⍴((+/EET1G)÷RSSG)
      ET4G←0 0 0 0,((N-4)↑ERRG)
      EET4G←(4↓(ERRG-ET4G))*2
      D4G←1⍴((+/EET4G)÷RSSG)
      P41G←1-(0.5×D4G)
      P42G←(((N*2)×(1-(0.5×D4G)))+(K*2))÷((N*2)-(K*2))
      P1G←(+/(1↓(ERRG×ET1G)))÷(+/(1↓¯1↓(ERRG×ERRG)))
      'RSSG         ',⍕RSSG
      'ESSG         ',⍕ESSG
      'TSSG         ',⍕TSSG
      'SG           ',⍕SG
      ' '
      'ADJR2G       ',⍕ADJR2G
      ' '
      'AG           ',(⍕BG[1;])
      'STERRAG      ',⍕STERRAG
      'BG           ',(⍕BG[2;])
      'STERRBG      ',⍕STERRBG
      ' '
      'FG           ',⍕FG
      'TAG          ',⍕TAG
      'TBG          ',⍕TBG
      ' '
      'D1G          ',⍕D1G
      'P1G          ',⍕P1G
      ' '
      'D4G          ',⍕D4G
      'P41G         ',⍕P41G
      'P42G         ',⍕P42G
    ∇
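The data transformation at the heart of the GLS function, and of the GLSD function for the first-order case, can be illustrated with a short Python sketch. The function name `ar_transform` and its arguments are hypothetical and do not appear in the original listings. The sketch assumes, as the listings appear to, that the first `lag` observations are scaled by the square root of one minus the squared autocorrelation coefficient and that every later observation has the product of the coefficient and its lagged value subtracted.

```python
import math

def ar_transform(series, rho, lag):
    """Transform a series to remove an AR(lag) error process with coefficient rho.

    lag=4 mirrors the GLS listing; lag=1 mirrors the GLSD listing.
    """
    scale = math.sqrt(1 - rho ** 2)
    # The first `lag` observations have no lagged partner, so they are rescaled.
    head = [value * scale for value in series[:lag]]
    # Every later observation is quasi-differenced against its lagged value.
    tail = [series[t] - rho * series[t - lag] for t in range(lag, len(series))]
    return head + tail
```

The same call would be applied to both the dependent and the independent variable before the transformed vectors are passed to an ordinary least squares routine.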


    ∇ GLSD;X2;Y;N;Y1;Y2;X21;X22;X1;XT;X;XTXI;XTY;K1;K
      ⍝ THIS FUNCTION CORRECTS THE REGRESSION MODEL FOR
      ⍝ THE EFFECTS OF AR(1).  THE TRANSFORMED VECTORS XF AND
      ⍝ YF ARE GLOBAL VARIABLES.
      'ENTER THE INDEPENDENT VARIABLE'
      C←⎕
      'ENTER THE DEPENDENT VARIABLE'
      D←⎕
      'ENTER THE VALUE OF P1'
      P1←⎕
      X2←C
      Y←D
      N←⍴Y
      Y1←0,1↑Y
      Y2←(N-1)↑Y
      Y1F←Y1×((1-(P1*2))*0.5)
      Y2F←(1↓Y)-(P1×Y2)
      YF←1↓(Y1F,Y2F)
      Y←(N,1)⍴YF
      X21←0,1↑X2
      X22←(N-1)↑X2
      X21F←X21×((1-(P1*2))*0.5)
      X22F←(1↓X2)-(P1×X22)
      XF←1↓(X21F,X22F)
      X1←N⍴1
      XT←(2,N)⍴(X1,XF)
      X←⍉XT
      XTXI←⌹(XT+.×X)
      XTY←XT+.×Y
      BF←XTXI+.×XTY
      K1←⍴BF
      K←K1[1]
      YHATF←X+.×BF
      EF←Y-YHATF
      RSSF←1⍴((⍉EF)+.×EF)
      S2F←RSSF÷(N-K)
      SF←S2F*0.5
      YBARF←(+/((N)⍴YF))÷N
      AYF←((N)⍴YF)-YBARF
      TSSF←((N)⍴YF)+.×AYF
      ESSF←TSSF-RSSF
      R2F←ESSF÷TSSF
      ADJR2F←1-((RSSF÷(N-K))÷(TSSF÷(N-1)))
      COVBF←S2F×XTXI
      STERRAF←COVBF[1;1]*0.5
      STERRBF←COVBF[2;2]*0.5
      TAF←BF[1;1]÷(COVBF[1;1]*0.5)
      TBF←BF[2;1]÷(COVBF[2;2]*0.5)
      FF←(R2F÷(K-1))÷((1-R2F)÷(N-K))
      ERRF←(N)⍴EF
      ET1F←0,((N-1)↑ERRF)
      EET1F←(1↓(ERRF-ET1F))*2
      D1F←1⍴((+/EET1F)÷RSSF)
      ET4F←0 0 0 0,((N-4)↑ERRF)
      EET4F←(4↓(ERRF-ET4F))*2
      D4F←1⍴((+/EET4F)÷RSSF)
      P41F←1-(0.5×D4F)
      P42F←(((N*2)×(1-(0.5×D4F)))+(K*2))÷((N*2)-(K*2))
      P1F←(+/(1↓(ERRF×ET1F)))÷(+/(1↓¯1↓(ERRF×ERRF)))
      'RSSF         ',⍕RSSF
      'ESSF         ',⍕ESSF
      'TSSF         ',⍕TSSF
      'SF           ',⍕SF
      ' '
      'ADJR2F       ',⍕ADJR2F
      ' '
      'AF           ',(⍕BF[1;])
      'STERRAF      ',⍕STERRAF
      'BF           ',(⍕BF[2;])
      'STERRBF      ',⍕STERRBF
      ' '
      'FF           ',⍕FF
      'TAF          ',⍕TAF
      'TBF          ',⍕TBF
      ' '
      'D1F          ',⍕D1F
      'P1F          ',⍕P1F
      ' '
      'D4F          ',⍕D4F
      'P41F         ',⍕P41F
      'P42F         ',⍕P42F
    ∇

    ∇ PRED;YPA;YAA;YPM;YAM;NUM;DENOM
      ⍝ THIS FUNCTION PROVIDES THE MEASURES OF PREDICTIVE
      ⍝ EFFECTIVENESS FOR THE BOX-JENKINS MODELS.
      'ENTER THE ACTUAL SERIES VECTOR'
      OH2←⎕
      'ENTER THE BOX-JENKINS FORECAST VALUES'
      YP←⎕
      YA←¯4↑OH2
      N←⍴YA
      MAPE←((+/((|(YP-YA))÷YA))÷N)×100
      RMSE←(((+/((YP-YA)*2))÷N)*0.5)÷((+/YA)÷N)
      YPA←(+/YP)÷N
      YAA←(+/YA)÷N
      YPM←YP-YPA
      YAM←YA-YAA
      NUM←(+/(YPM×YAM))÷N
      DENOM←(((+/(YPM*2))÷N)×((+/(YAM*2))÷N))*0.5
      CORR←NUM÷DENOM
      'PREDICTED Y'
      YP
      ' '
      'ACTUAL Y'
      YA
      ' '
      'CORRELATION       ',⍕CORR
      ' '
      'RMSE/MA           ',⍕RMSE
      ' '
      'MAPE              ',⍕MAPE
    ∇
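The three measures of effectiveness computed by PRED can be restated compactly in Python. This is a minimal sketch under stated assumptions: the function name `prediction_measures` is hypothetical, the actual values are assumed to be positive (as overhead costs are), and the two series are assumed to have equal length with some variation in each.

```python
import math

def prediction_measures(actual, predicted):
    """Return (correlation, RMSE / mean of actuals, MAPE in percent)."""
    n = len(actual)
    pairs = list(zip(actual, predicted))
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    # Mean absolute percentage error, expressed in percent.
    mape = 100 * sum(abs(p - a) / a for a, p in pairs) / n
    # Root mean squared error scaled by the mean of the actual values.
    rmse_ratio = math.sqrt(sum((p - a) ** 2 for a, p in pairs) / n) / mean_a
    # Correlation coefficient between actual and predicted values.
    cov = sum((p - mean_p) * (a - mean_a) for a, p in pairs) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    var_a = sum((a - mean_a) ** 2 for a in actual) / n
    corr = cov / math.sqrt(var_p * var_a)
    return corr, rmse_ratio, mape
```

A forecast that overstates every actual value by ten percent, for example, yields a MAPE of 10 and a correlation of 1, which shows why the correlation coefficient alone does not reveal a systematic bias.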


    ∇ PRED1;YPA;YAA;YPM;YAM;NUM;DENOM
      ⍝ THIS FUNCTION COMPUTES THE MEASURES OF PREDICTIVE
      ⍝ EFFECTIVENESS FOR THOSE CASES IN WHICH CORRECTIONS
      ⍝ FOR FIRST ORDER AUTOCORRELATION ARE MADE.
      'ENTER THE VECTOR CONTAINING THE ACTUAL Y VALUES.'
      OH2←⎕
      'ENTER THE VECTOR CONTAINING THE ACTUAL X VALUES.'
      DL2←⎕
      'ENTER THE VALUE OF P1.'
      P←⎕
      YA←¯4↑OH2
      X←¯4↑DL2
      XN←4↑(¯5↑DL2)
      YN←5⍴0
      YP←4⍴0
      YN[1]←1↑(¯5↑OH2)
      L←0
      ITERATE:L←L+1
      YP[L]←YN[L+1]←(P×YN[L])+((X[L]-(P×XN[L]))×B)
      →(L≤3)/ITERATE
      YN←4↑YN
      N←⍴X
      YP←(P×YN)+((X-(P×XN))×B)
      MAPE←((+/((|(YP-YA))÷YA))÷N)×100
      RMSE←(((+/((YP-YA)*2))÷N)*0.5)÷((+/YA)÷N)
      YPA←(+/YP)÷N
      YAA←(+/YA)÷N
      YPM←YP-YPA
      YAM←YA-YAA
      NUM←(+/(YPM×YAM))÷N
      DENOM←(((+/(YPM*2))÷N)×((+/(YAM*2))÷N))*0.5
      CORR←NUM÷DENOM
      'PREDICTED Y'
      YP
      ' '
      'ACTUAL Y'
      YA
      ' '
      'CORRELATION       ',⍕CORR
      ' '
      'RMSE/MA           ',⍕RMSE
      ' '
      'MAPE              ',⍕MAPE
    ∇
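The iterative forecast at the core of PRED1 can be sketched in Python as follows. The name `ar1_forecast` and its arguments are hypothetical. The listing as recovered shows only a slope term, so the optional `a` argument (an intercept, defaulting to zero) is an addition made for this sketch and is not taken from the original. Each forecast uses the previous forecast as its lagged dependent value, while the lagged independent values are the known actuals.

```python
def ar1_forecast(y_last, x_last, x_future, rho, b, a=0.0):
    """Forecast successive periods from a regression with AR(1) errors.

    y_last, x_last: last observed values before the forecast period.
    x_future: known values of the independent variable in the forecast period.
    rho: first-order autocorrelation estimate; b: slope.
    a: intercept of the untransformed model (assumed, default 0).
    """
    forecasts = []
    y_prev, x_prev = y_last, x_last
    for x_t in x_future:
        y_hat = rho * y_prev + a * (1 - rho) + b * (x_t - rho * x_prev)
        forecasts.append(y_hat)
        # The forecast becomes the lagged Y for the next period.
        y_prev, x_prev = y_hat, x_t
    return forecasts
```

With `rho` set to zero the recursion collapses to the ordinary regression prediction, which is a convenient check on any implementation.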


    ∇ PRED4;YPA;YAA;YPM;YAM;NUM;DENOM
      ⍝ THIS FUNCTION COMPUTES THE MEASURES OF PREDICTIVE
      ⍝ EFFECTIVENESS FOR THOSE CASES WHERE CORRECTIONS FOR
      ⍝ FOURTH ORDER AUTOCORRELATION ARE MADE.
      'ENTER A VECTOR CONTAINING THE ACTUAL Y VALUES.'
      OH2←⎕
      'ENTER THE VECTOR OF X VALUES'
      DL2←⎕
      'ENTER THE VALUE OF P4'
      P←⎕
      YA←¯4↑OH2
      YN←¯4↑OH1
      X←¯4↑DL2
      XN←¯4↑DL1
      N←⍴X
      YP←(P×YN)+((X-(P×XN))×B)
      MAPE←((+/((|(YP-YA))÷YA))÷N)×100
      RMSE←(((+/((YP-YA)*2))÷N)*0.5)÷((+/YA)÷N)
      YPA←(+/YP)÷N
      YAA←(+/YA)÷N
      YPM←YP-YPA
      YAM←YA-YAA
      NUM←(+/(YPM×YAM))÷N
      DENOM←(((+/(YPM*2))÷N)×((+/(YAM*2))÷N))*0.5
      CORR←NUM÷DENOM
      'PREDICTED Y'
      YP
      ' '
      'ACTUAL Y'
      YA
      ' '
      'CORRELATION       ',⍕CORR
      ' '
      'RMSE/MA           ',⍕RMSE
      ' '
      'MAPE              ',⍕MAPE
    ∇

    ∇ PRED40;YPA;YAA;YPM;YAM;NUM;DENOM
      ⍝ THIS FUNCTION COMPUTES THE MEASURES OF PREDICTIVE
      ⍝ EFFECTIVENESS FOR THOSE MODELS CORRECTING FOR
      ⍝ FIRST ORDER AND FOURTH ORDER AUTOCORRELATION.
      'ENTER THE VECTOR CONTAINING ACTUAL Y VALUES'
      OH2←⎕
      'ENTER THE VECTOR CONTAINING X VALUES'
      DL2←⎕
      'ENTER A VALUE FOR P4'
      P4←⎕
      'ENTER THE VALUE FOR P1'
      P1←⎕
      'ENTER THE VECTOR CONTAINING THE PREDICTED VALUES'
      YP←⎕
      YA←¯4↑OH2
      X←¯4↑DL2
      XT1←4↑(¯5↑DL2)
      XT4←4↑(¯8↑DL2)
      XT5←4↑(¯9↑DL2)
      XS←B×(((X-(P4×XT4))-(P1×XT1))+(P1×P4×XT5))
      YP←4⍴0
      YT1←5⍴0
      YT1[1]←1↑(¯5↑OH2)
      YT4←4↑(¯8↑OH2)
      YT5←4↑(¯9↑OH2)
      L←0
      ITERATE:L←L+1
      YP[L]←YT1[L+1]←(((P1×YT1[L])+(P4×YT4[L]))-(P1×P4×YT5[L]))+XS[L]
      →(L≤3)/ITERATE
      N←⍴X
      MAPE←((+/((|(YP-YA))÷YA))÷N)×100
      RMSE←(((+/((YP-YA)*2))÷N)*0.5)÷((+/YA)÷N)
      YPA←(+/YP)÷N
      YAA←(+/YA)÷N
      YPM←YP-YPA
      YAM←YA-YAA
      NUM←(+/(YPM×YAM))÷N
      DENOM←(((+/(YPM*2))÷N)×((+/(YAM*2))÷N))*0.5
      CORR←NUM÷DENOM
      'PREDICTED Y'
      YP
      ' '
      'ACTUAL Y'
      YA
      ' '
      'CORRELATION       ',⍕CORR
      ' '
      'RMSE/MA           ',⍕RMSE
      ' '
      'MAPE              ',⍕MAPE
    ∇
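When both AR(1) and AR(4) corrections have been applied, the two filters multiply, which produces the lag-1, lag-4, and lag-5 terms seen in PRED40. The following Python sketch illustrates that recursion. The name `ar1_ar4_forecast` and its arguments are hypothetical, the sketch omits an intercept as the recovered listing appears to, and it assumes at least five historical observations are supplied.

```python
def ar1_ar4_forecast(y_hist, x_hist, x_future, rho1, rho4, b):
    """Forecast from a regression with combined AR(1) and AR(4) errors.

    y_hist, x_hist: equal-length histories ending just before the forecast period.
    x_future: known independent-variable values for the forecast period.
    """
    if len(y_hist) != len(x_hist) or len(y_hist) < 5:
        raise ValueError("need equal-length histories of at least 5 points")
    y = list(y_hist)                    # extended with forecasts as they are made
    x = list(x_hist) + list(x_future)
    start = len(x_hist)
    forecasts = []
    for t in range(start, len(x)):
        # Contribution of the doubly filtered independent variable.
        xs = b * (x[t] - rho4 * x[t - 4] - rho1 * x[t - 1]
                  + rho1 * rho4 * x[t - 5])
        # Lag-1 value is the prior forecast after the first step;
        # lag-4 and lag-5 values are actuals for a four-period horizon.
        y_hat = (rho1 * y[t - 1] + rho4 * y[t - 4]
                 - rho1 * rho4 * y[t - 5] + xs)
        y.append(y_hat)
        forecasts.append(y_hat)
    return forecasts
```

Because the product of the two filters is the same in either order, the recursion does not depend on which form of autocorrelation was removed first. Setting either coefficient to zero reduces it to the single-process case.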